AITopics | sparsity pattern

Optimal transport (OT) has emerged as a fundamental tool in modern machine learning, yet its computational cost remains a significant bottleneck for large-scale applications. While harnessing the massive parallelism of modern GPU hardware is critical for efficiency, the de facto standard Sinkhorn algorithm, despite its ease of parallelization, often suffers from slow convergence in challenging problems. More recently, the sparse-plus-low-rank quasi-Newton method offers a balance between convergence rate and per-iteration complexity; however, its efficiency on GPUs is severely hindered by the serial nature of sparse matrix symbolic analysis and irregular memory access patterns. To bridge this gap, we present cuRegOT, a high-performance GPU solver tailored for entropic-regularized OT. We introduce a suite of algorithmic and architectural optimizations, including an amortized symbolic analysis strategy to mitigate CPU bottlenecks, an asynchronous Sinkhorn iterates generation mechanism, and a fused kernel for bandwidth-efficient gradient evaluation. These strategies are backed by rigorous theoretical guarantees ensuring algorithmic convergence. Extensive numerical experiments demonstrate that cuRegOT achieves significant speedups over state-of-the-art GPU-based solvers across a variety of benchmark tasks.

artificial intelligence, problem size, sinkhorn, (17 more...)

arXiv.org Machine Learning

2605.08793

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.35)

Add feedback

New Bounds for Hyperparameter Tuning of Regression Problems Across Instances

Neural Information Processing SystemsApr-30-2026, 10:08:17 GMT

The task of tuning regularization coefficients in regularized regression models with provable guarantees across problem instances still poses a significant challenge in the literature. This paper investigates the sample complexity of tuning regularization parameters in linear and logistic regressions under ℓ1 and ℓ2-constraints in the data-driven setting. For the linear regression problem, by more carefully exploiting the structure of the dual function class, we provide a new upper bound for the pseudo-dimension of the validation loss function class, which significantly improves the best-known results on the problem. Remarkably, we also instantiate the first matching lower bound, proving our results are tight. For tuning the regularization parameters of logistic regression, we introduce a new approach to studying the learning guarantee via an approximation of the validation loss function class. We examine the pseudo-dimension of the approximation class and construct a uniform error bound between the validation loss function class and its approximation, which allows us to instantiate the first learning guarantee for the problem of tuning logistic regression regularization coefficients.

artificial intelligence, function class, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
Europe > Netherlands (0.28)

Genre: Research Report > New Finding (0.97)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

BayesTune: Bayesian Sparse Deep Model Fine-tuning

Neural Information Processing SystemsApr-29-2026, 19:48:41 GMT

Deep learning practice is increasingly driven by powerful foundation models (FM), pre-trained at scale and then fine-tuned for specific tasks of interest. A key property of this workflow is the efficacy of performing sparse or parameter-efficient finetuning, meaning that by updating only a tiny fraction of the whole FM parameters on a downstream task can lead to surprisingly good performance, often even superior to a full model update. However, it is not clear what is the optimal and principled way to select which parameters to update. Although a growing number of sparse fine-tuning ideas have been proposed, they are mostly not satisfactory, relying on hand-crafted heuristics or heavy approximation. In this paper we propose a novel Bayesian sparse fine-tuning algorithm: we place a (sparse) Laplace prior for each parameter of the FM, with the mean equal to the initial value and the scale parameter having a hyper-prior that encourages small scale.

artificial intelligence, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Genre:

Workflow (0.48)
Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Add feedback

1afb9ca4adf1d9cb3c87ff3e22a29049-Supplemental-Conference.pdf

Neural Information Processing SystemsApr-24-2026, 22:04:28 GMT

machine learning, natural language, sparsity, (19 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

1afb9ca4adf1d9cb3c87ff3e22a29049-Paper-Conference.pdf

Neural Information Processing SystemsApr-24-2026, 22:04:25 GMT

machine learning, natural language, sparsity, (18 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

9ed27554c893b5bad850a422c3538c15-Paper.pdf

Neural Information Processing SystemsFeb-19-2026, 05:09:32 GMT

However, these models suffer from quadratic computational cost in the input sequence lengthn to compute pairwise attention in each layer.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

New Bounds for Hyperparameter Tuning of Regression Problems Across Instances

Neural Information Processing SystemsFeb-18-2026, 03:22:00 GMT

In this work, we investigate an alternative setting for tuning regularization parameters, namely data-driven algorithm design, following the previous line of work by Balcan et al. [

artificial intelligence, machine learning, rlr, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Add feedback

BayesTune: Bayesian Sparse Deep Model Fine-tuning

Neural Information Processing SystemsFeb-17-2026, 04:42:08 GMT

Deep learning practice is increasingly driven by powerful foundation models (FM), pre-trained at scale and then fine-tuned for specific tasks of interest.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: